Vox Populi: Collecting High-Quality Labels from a Crowd
Abstract
With the emergence of search engines and crowdsourcing websites, machine learning practitioners are faced with datasets that are labeled by a large heterogeneous set of teachers. These datasets test the limits of our existing learning theory, which largely assumes that data is sampled i.i.d. from a fixed distribution. In many cases, the number of teachers actually scales with the number of examples, with each teacher providing just a handful of labels, precluding any statistically reliable assessment of an individual teacher’s quality. In this paper, we study the problem of pruning low-quality teachers in a crowd, in order to improve the label quality of our training set. Despite the hurdles mentioned above, we show that this is in fact achievable with a simple and efficient algorithm, which does not require that each example be repeatedly labeled by multiple teachers. We provide a theoretical analysis of our algorithm and back our findings with empirical evidence.
Related resources
Vox Populi: An Interactive Evolutionary System for Algorithmic Music Composition
While recent techniques of digital sound synthesis have put numerous new sounds on the musician’s desktop, several artificial-intelligence (AI) techniques have also been applied to algorithmic composition. This article introduces Vox Populi, a system based on evolutionary computation techniques for composing music in real time. In Vox Populi, a population of chords codified according to MIDI pr...
The Wisdom of Crowds (Vox Populi) and Antidepressant Use
Under certain conditions, groups of people may (collectively) make better judgments than experts. Galton connected this phenomenon to the phrase vox populi in a 1907 paper. Arguably, an example of the phenomenon may be found in recent stabilization of the frequency of antidepressant use, following decades of increases. There is no evidence that a change in physician behaviour has caused this s...
VOX POPULI: Automatic Generation of Biased Video Sequences
We describe our experimental rhetoric engine Vox Populi that generates biased video-sequences from a repository of video interviews and other related audio-visual web sources. Users are thus able to explore their own opinions on controversial topics covered by the repository. The repository contains interviews with United States residents stating their opinion on the events occurring after the ...
A new approach to relevancy in Internet searching - the "Vox Populi Algorithm"
In this paper we derive a new algorithm for Internet searching. The main idea is to extend existing algorithms with a component that reflects the interests of users more closely than existing methods do. The “Vox Populi Algorithm” (VPA) [1] creates a feedback loop from the users to the content of the search index. The information derived from analysis of the users' queries is used to m...
Medical marijuana, compassionate use, and public policy: Expert opinion or vox populi?
Publication date: 2009